TRAFFIC CONTROL METHOD AND SYSTEM 



FIELD OF THE INVENTION 

The present invention is in the general field of traffic management systems for 
communication networks. 



BACKGROUND OF THE INVENTION 

Networks are used to transfer voice, video and data between various network
devices. Network devices such as switches are located within networks to direct the
transfer of network traffic between various devices. Network traffic is typically bursty in
nature. In order to compensate for network traffic bursts, memory queues are incorporated
into network devices. These allow the device to temporarily store traffic when the
incoming rate is higher than an available outgoing rate. When more than one queue has
traffic and each queue is contending for the same bandwidth, some of the traffic is required
to wait in the queues and some mechanism is needed to determine how contention is
resolved.

In order to resolve contention and provide a Quality of Service guarantee or some
method of fair contention resolution to traffic, queue management algorithms must be
implemented in the network devices. In one algorithm referred to as "priority queuing",
contending queues are assigned different priorities and traffic is forwarded from the queues
in strict priority order. For example, referring to Figure 1, four queues Q1 to Q4
(designated 1-4 respectively) hold packetised traffic that is to be forwarded on link (20).
The link (20) has a finite bandwidth R_L that must be shared by the traffic. To resolve the
contention, the queues are prioritised and packets from the queues are forwarded to the link
(20) in strict priority order under the control of a queue manager (10), such as a switch.
While priority queuing works well when there is little contention, where there is contention
traffic from higher priority queues is forwarded at the expense of traffic from lower priority
queues. In this situation, traffic from high priority queues consumes the majority of finite
link bandwidth and lower priority queues are starved of bandwidth and back up the network
device architectures, potentially increasing latency to the point where packets are dropped.
Dropped packets may require an entire stream to be retransmitted, thereby defeating the
purpose of prioritising queues in the first place.

When traffic is to be carried in an ATM network a predetermined path called a
"virtual circuit" is agreed between an initiating end point and nodes within the network such
that, for the duration of the connection, the agreed traffic from the end point can use that
particular path. When the path is established, a "contract" is made by which the network
agrees to carry the traffic and to meet any quality of service and minimum bandwidth
guarantees so long as the traffic stays within specified traffic descriptors. Traffic in an
ATM network is formatted into equal sized cells or packets.

Where a number of links are to contend for bandwidth on a single outgoing link, it is
quite likely that some of the links will have an agreed contract for a proportion of the
outgoing link's bandwidth. However, it is also quite likely that other links without
contracted bandwidth on the outgoing link may also require access to the link. These are
accepted on a "best-effort" basis, the outgoing link giving any uncontracted bandwidth to
these links. Obviously, multiplexing of the traffic streams from the links onto the outgoing
link must be managed in a fair way whilst satisfying the established contracts. Fair sharing
algorithms and round-robin type algorithms are based primarily on priority levels
determined in dependence on the contracted bandwidth for each link. Thus, links without
specific contracted bandwidth, such as those that have been accepted on a "best effort"
basis, are ignored whilst there is traffic pending in contracted links or whilst it is a
contracted link's turn to transmit on the outgoing link. One way that has been suggested to
overcome this problem is to give those links that are not contracted a weight corresponding
to a low priority so that the links are multiplexed onto whatever remains after the contracted
links are multiplexed onto the link. However, such an arbitrary assignment of bandwidth
does not result in efficient sharing of the outgoing link. Furthermore, bandwidth on the
outgoing link is reserved for each link in dependence on its weight, irrespective of whether
that link has anything to transmit. Thus, a high priority link without any data to transmit
takes up bandwidth on the outgoing link that could be used to transmit data from the
uncontracted links.

One algorithm that attempts to resolve issues surrounding contention management
without the above problems is "Weighted Fair Queuing" (WFQ). Contending queues are
assigned weights and packets are forwarded from queues in proportion to the weights
assigned to each queue. For example, referring again to Figure 1, if the queue manager (10)
was a Weighted Fair Queue controller, the four queues would be assigned a weight that
represents the amount of bandwidth that is reserved for that queue. If the total available
bandwidth of the link were 100 bytes per second, then with queue weights assigned as 20%,
25%, 15% and 40% to Q1, Q2, Q3 and Q4 respectively, Q1 would be allocated 20 bytes per
second on the link, Q2 would be allocated 25 bytes per second, Q3 15 bytes per second and
Q4 40 bytes per second. The queue manager ensures queues have fair access to the
outgoing link whilst satisfying the allocated bandwidths. In one implementation of the
weighted fair queue algorithm, a linear array is defined. Each array element represents a
transmission time on the outgoing link. Queues are scheduled by linking them to one of the
elements in the array, the order of transmission being determined by the order of the queues
in the array. Once a transmission is made from a queue according to the schedule, the
position of the queue within the array is recalculated. The recalculation schedules the queue
further along the array, the exact position being calculated in dependence on the queue's
assigned weight.

Whilst the basic Weighted Fair Queue algorithm works well for preventing the
starvation that occurs in priority queuing and establishes a maximum flow rate for each
queue, link bandwidth is still wasted because the percentage of link bandwidth reserved
for a particular queue is reserved whether or not there are packets waiting. Furthermore,
there is no apparent way of distributing excess bandwidth between other queues because
queues do not have priority assigned relative to one another.

In the past, the above problem has usually been approached by the implementation of a
scheduler based on a linear array, such as that for the weighted fair queue scheduler
discussed above. However, one problem with such an approach is that in order to obtain a
high granularity (granularity is a measurement of the minimum bandwidth of the
outgoing link that can be allocated) the array must be made very large, because the
size of the array determines the minimal bandwidth (if the array is of size N, the minimum
bandwidth supported will be 1/N*(Link Rate)). This is a particular problem for [...]
devices since the addition of a new stream requires the whole array to be
reconfigured, a task which could be impossible to perform without interrupting
scheduling for large periods of time.



SUMMARY OF THE INVENTION

According to one aspect of the present invention, there is provided a method of
scheduling traffic from a plurality of queues onto a link, at least one of the queues having an
agreed bandwidth requirement and at least one of the queues having no agreed bandwidth
requirement, the method comprising the steps of:

assigning a weight to each queue having an agreed bandwidth requirement, the
weight being determined in dependence on the bandwidth requirement;

grouping the queues having no agreed bandwidth requirement into a group, Q*, and
assigning a weight to the group Q*;

scheduling the queues for transmission on the link in dependence on their assigned
weight and on a last transmission time for the respective queue, wherein if a scheduled
queue has no traffic to transmit another queue is scheduled, the group Q* being scheduled
after the other queues.

The present invention seeks to provide a traffic scheduling method and an associated
system that is capable of scheduling a number of queues of traffic having an allocated
proportion of the bandwidth of a link and one or more queues of traffic without an allocated
bandwidth. A particular advantage of the present invention is that unused bandwidth
allocated to a link is reclaimed and used for other links. Furthermore, the present invention
seeks to provide such a method and system that is able to offer a high level of granularity.

Queues that do not have an agreed bandwidth requirement (typically those accepted
on a best effort basis or with very cheap link access contracts) are grouped into a group Q*
and treated as a special queue. Effectively, the group is treated as one logical queue that is
allocated unused bandwidth as and when it is available.

Whilst the present invention is primarily intended for use in networks having
fixed size protocol data units (PDUs), such as cells in ATM (Asynchronous Transfer Mode)
networks, it could also be adapted for use in networks having variable size PDUs, such as
IP (Internet Protocol) networks.
A weight for a queue having an agreed bandwidth requirement may be determined
in dependence on the ratio of the queue's required bandwidth to the available link
bandwidth, a queue with a low weight being scheduled for transmission before a queue with
a higher weight.

A value, STEP, may be defined as the lowest assignable weight, wherein the weight
W_N, for a queue, Q_N, is calculated as:

$W_N = \frac{R_L}{R_N} \times \mathrm{STEP}$

where R_L is the link bandwidth and R_N is the queue's required bandwidth.

The group Q* may be assigned a weight of STEP.
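
By way of illustration only, the weight calculation above can be sketched in Python. The function name `assign_weight` and the argument checks are hypothetical, and the upward rounding is taken from the detailed description of the preferred embodiment rather than from this summary; the sketch is not a definitive implementation.

```python
def assign_weight(link_rate: int, required_rate: int, step: int) -> int:
    """Return W_N = (R_L / R_N) * STEP, rounded upward to an integer.

    link_rate and required_rate must use the same unit (for example Mbit/s).
    """
    if required_rate <= 0 or required_rate > link_rate:
        raise ValueError("required_rate must be in the range (0, link_rate]")
    # Integer ceiling division: -(-a // b) equals ceil(a / b) for positive b.
    return -(-link_rate * step // required_rate)


# A queue contracted for the whole link receives the lowest weight, STEP.
print(assign_weight(4, 4, 32))  # 32
print(assign_weight(4, 2, 32))  # 64
```

With a 4 Mbit/s link and STEP set at 32, queues contracted for 2 Mbit/s and 1 Mbit/s would receive weights of 64 and 128 under these assumptions.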

The step of scheduling queues may include the steps of:

maintaining a global counter, G;

maintaining a counter for each queue, counter C_N being the counter for queue Q_N;

incrementing C_N by W_N and G by STEP each time a queue, Q_N, is scheduled
for transmission and has traffic to transmit, wherein a queue, Q_N, is scheduled for
transmission only if C_N <= G.

The step of scheduling queues may further comprise the step of ordering the
queues in increasing rank of their respective weights, the group Q* being ordered last,
wherein the step of scheduling queues processes the queues in accordance with said
order.

The method may further comprise the steps of assigning the global counter, G, a
maximum size in bits and determining an end point, U,

wherein, when G reaches or exceeds the value of U, G is reset to a predetermined value,
L, and counters C_N are reset to C_N - (G-L) or 0, whichever is greater.

The predetermined value, L, may be set at 2 x STEP.

The maximum usable weight may be set at U-STEP.

According to another aspect of the present invention, there is provided a traffic
control system, comprising a traffic controller arranged to process traffic from a plurality of
queues to schedule the traffic on an outgoing link, the plurality of queues including at least
one queue having an agreed bandwidth requirement and at least one queue having no
agreed bandwidth requirement, the traffic controller being arranged to assign a weight to
each queue having an agreed bandwidth requirement, the traffic controller determining the
weight in dependence on the bandwidth requirement, to group the queues having no agreed
bandwidth requirement into a group, Q*, and assign a weight to the group, and to schedule
the queues for transmission on the link in dependence on their assigned weight and on a last
transmission time for the respective queue, wherein if a scheduled queue has no traffic to
transmit another queue is scheduled, the group Q* being scheduled after the other queues.
The traffic controller may determine a weight for a queue having an agreed
bandwidth requirement in dependence on the ratio of the queue's required bandwidth to the
available link bandwidth, the traffic controller being arranged to schedule a queue with a
low weight before a queue with a higher weight.

A predetermined value, STEP, may be stored in a memory as the lowest assignable
weight, wherein the weight W_N, for a queue, Q_N, is calculated as:

$W_N = \frac{R_L}{R_N} \times \mathrm{STEP}$

where R_L is the link bandwidth and R_N is the queue's required bandwidth.

The group Q* may be assigned a weight of STEP.

The traffic controller may schedule traffic from the queues by:

maintaining a global counter, G, in a memory;

maintaining a counter for each queue in a memory, counter C_N being the counter
for queue Q_N;

incrementing C_N by W_N and G by STEP each time a queue, Q_N, is scheduled
for transmission and has traffic to transmit, wherein a queue, Q_N, is scheduled for
transmission only if C_N <= G.

The traffic controller may be arranged to order the queues in increasing rank of
their respective weights, the group Q* being ordered last, wherein the traffic controller
processes the queues in accordance with said order.

The global counter, G, may be stored in a register of length n bits, the controller
being arranged to monitor the register for when its value reaches or exceeds a value, U,
where

$U = 2^{(n-1)}$

wherein, when G reaches or exceeds the value of U, G is reset to a predetermined value,
L, and counters C_N are reset to C_N - (G-L) or 0, whichever is greater.

The traffic controller may include a data structure in a memory, the data structure
including storage means for a link to each traffic element queued for transmission, an
indicator as to the last transmission time for a queue and a schedule for each queue
indicating the next transmission time for a queue, the traffic controller scheduling traffic
in accordance with the contents of the data structure.

The traffic controller may further comprise a further data structure, the further data
structure being a copy of the data structure, wherein, upon receiving a further queue to
schedule, the traffic controller is arranged to recalculate a transmission schedule in the
further data structure including the further queue and to then schedule traffic in
accordance with the contents of the further data structure.
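
The use of a copy of the data structure can be illustrated with a minimal Python sketch. The class name `DoubleBufferedConfig`, the dictionary layout and the key `"Q*"` are hypothetical stand-ins chosen for illustration; the actual data structure is the one described with reference to Figure 4, which this sketch does not attempt to reproduce.

```python
import copy

BEST_EFFORT = "Q*"  # hypothetical key standing in for the group Q*


def build_order(weights):
    """Order queues by increasing weight, with the group Q* always last."""
    normal = sorted((q for q in weights if q != BEST_EFFORT),
                    key=lambda q: weights[q])
    return normal + ([BEST_EFFORT] if BEST_EFFORT in weights else [])


class DoubleBufferedConfig:
    """Two copies of the schedule: one in use, one free to be recalculated."""

    def __init__(self, weights):
        self.active = {"weights": dict(weights), "order": build_order(weights)}
        self.shadow = copy.deepcopy(self.active)

    def add_queue(self, name, weight):
        # Recalculate the schedule in the copy; the active structure is
        # left untouched, so existing streams are not interrupted.
        self.shadow = copy.deepcopy(self.active)
        self.shadow["weights"][name] = weight
        self.shadow["order"] = build_order(self.shadow["weights"])
        # Swap roles: the updated copy is now used for scheduling.
        self.active, self.shadow = self.shadow, self.active


config = DoubleBufferedConfig({"Q1": 64, "Q2": 128, "Q*": 32})
config.add_queue("Q3", 96)
print(config.active["order"])  # ['Q1', 'Q3', 'Q2', 'Q*']
```

The design choice sketched here is that all recalculation happens off to one side, so the only operation visible to the scheduler is the final swap.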

The traffic controller may comprise an Application Specific Integrated Circuit.

The traffic controller may comprise a Field Programmable Gate Array.

According to another aspect of the present invention, there is provided a
computer-readable medium, on which is stored a computer program of instructions for a
general purpose computer for scheduling traffic from a plurality of queues onto a link, at
least one of the queues having an agreed bandwidth requirement and at least one of the
queues having no agreed bandwidth requirement, comprising, in combination:

means for enabling the computer to assign a weight to each queue having an agreed
bandwidth requirement, the means determining the weight in dependence on the bandwidth
requirement;

means for enabling the computer to group the queues having no agreed bandwidth
requirement into a group, Q*, and to assign a weight to the group;

means for enabling the computer to schedule the queues for transmission on the
link in dependence on their assigned weight and on a last transmission time for the
respective queue, wherein if a scheduled queue has no traffic to transmit the means
schedules another queue, the means scheduling the group Q* after the other queues.



BRIEF DESCRIPTION OF THE DRAWINGS 

For a better understanding, the invention will now be described, by way of example
only, with reference to the accompanying drawings, in which:

Figure 1 is a schematic diagram of a traffic multiplexing system;

Figure 2 is a schematic diagram of a traffic multiplexing system incorporating a
traffic control system according to the present invention;

Figure 3 is a flow chart illustrating the scheduling operation of a method according
to a preferred aspect of the present invention; and

Figure 4 is a block diagram of a data structure suitable for use with the
system of the present invention in a real-time environment.



DETAILED DESCRIPTION OF A PREFERRED EMBODIMENT

Figure 2 is a schematic diagram of a traffic multiplexing system incorporating a
traffic control system according to the present invention. A number of links (of which
3 are shown and designated 100-120), which are contracted with the [...] to
occupy a respective bandwidth R1-R_N of an outgoing link (140), are queued in respective
queues Q1-Q_N. The sum of R1 to R_N is less than or equal to the link bandwidth R_L. Each
queue is given a weight W1 to W_N that is inversely proportional to its contracted
bandwidth. A parameter, STEP, is set as the period in time it takes to transmit a single
cell and is the lowest assignable weight. Therefore, if a link is given the weight STEP it
is allocated all of the bandwidth of the outgoing link R_L and is always scheduled to
transmit. Note that W_i is the weight associated with link i according to the following
formula:

W_i = R_L/R_i * STEP, rounded upward.

It can be seen that if we wish to allocate a link half of the bandwidth of the outgoing
link, the weight will be 2*STEP.

A special queue, Q* (130), is also established. All streams that do not have a
contracted bandwidth are routed to this queue. Cells from the special queue are scheduled
on a best effort basis after pending cells from the other queues have been transmitted. The
weight for the special queue (W*) is set at STEP, although the scheduler (150) knows not to
treat this queue as a normal queue and therefore does not allocate all bandwidth to this
queue.

A global counter G is also maintained. The counter is incremented by STEP each
cell time. A local counter (C_N) is also maintained for each link. Every time the queue
corresponding to a link is selected to transmit, the corresponding counter is incremented
by its weight.

In order to select the queue to be allowed to transmit a cell for a particular cell time,
each counter C_i is compared with G. A queue, i, is allowed to transmit only when C_i <= G.

However this is not enough, since if we process the queues in an arbitrary order then
the situation arises that no queue is allowed to transmit if the queues are ordered one way
but one or more queues would have been allowed to transmit if the queues were ordered in
some other way.

Therefore, in order for the system to be work-conserving (a critical requirement for
real-time switches and systems), the queues are ordered by their respective weights W_i.
Queues are ordered in increasing rank of W_i. However, the special queue Q* is always
ranked last.

For example, two queues Q1 and Q2, having contracted bandwidths of 2Mb/s and
1Mb/s respectively, may be controlled along with a number of best-effort streams queued at
Q* for access to a link of 4Mb/s. STEP is set at 32, therefore W1 = 64, W2 = 128 and W* = 32.
Table 1 shows the progression of the above scheduling algorithm for this example.

Transmission     1    2    3    4    5    6    7    8
G                0   32   64   96  128  160  192  224
C1               0✓  64   64✓ 128  128✓ 192  192✓ 256
C2               0    0✓ 128  128  128  128✓ 256  256
C*               0    0    0    0✓  32   32   32   32✓

Table 1

The tick symbols following counter values indicate the counter selected for
transmission. It will be seen that counter C1 is selected at transmissions 1, 3, 5 and 7, C2 is
selected at transmissions 2 and 6 and counter C* is selected at transmissions 4 and 8.

The lowest granularity is achieved when the rate of the stream is high and its
allocated weight is (STEP+1)/STEP. The granularity increases as the rate of a stream
decreases.
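
The progression in Table 1 can be checked with a short simulation. The following Python sketch is an illustration only: the function name `simulate` and the dictionary keys are hypothetical, and it assumes that every queue always has a cell waiting, which the example does not state explicitly.

```python
STEP = 32
# Weights from the worked example: W1 = 64, W2 = 128, W* = 32.
WEIGHTS = {"C1": 64, "C2": 128, "C*": 32}
# Processing order: increasing weight, with the special queue always last.
ORDER = ["C1", "C2", "C*"]


def simulate(transmissions):
    """Return the counters selected, the final counters and the final G."""
    g = 0
    counters = {name: 0 for name in ORDER}
    selected = []
    for _ in range(transmissions):
        # Take the first queue in rank order whose counter has not passed G.
        chosen = next((n for n in ORDER if counters[n] <= g), None)
        if chosen is not None:
            counters[chosen] += WEIGHTS[chosen]
        selected.append(chosen)
        g += STEP  # G advances by STEP every cell time
    return selected, counters, g


selected, counters, g = simulate(8)
print(selected)  # ['C1', 'C2', 'C1', 'C*', 'C1', 'C2', 'C1', 'C*']
```

Under these assumptions C1 is selected in four of the eight cell times and C2 in two, matching their contracted shares of one half and one quarter of the link, with the remaining two cell times going to the special queue.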

Figure 3 is a flowchart of the steps of a scheduling algorithm according to a 
preferred aspect of the present invention. 

It can be seen that the selection of the value of STEP has an effect on the granularity
of the system. For example, if we have a link rate of 4Mbit/sec and wish to set a weight for
an agreed rate of 3Mbit/sec, without STEP a weight of 4/3 rounded up (i.e. 2) would be
given by prior systems. However, this would only result in a rate of 2Mbit/sec. With a
value of STEP set at 32, the weight would be 4/3 x 32, rounded up, which gives 43. Thus,
the resultant rate would be 2.977Mbit/sec.
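
The arithmetic of this example can be reproduced with a few lines of Python. The function name `achieved_rate` is hypothetical, and the sketch assumes that a queue with weight W is served once every W/STEP cell times, so that its share of the link is STEP/W.

```python
def achieved_rate(link_rate: int, agreed_rate: int, step: int) -> float:
    """Rate actually delivered once the weight has been rounded upward."""
    # Weight = (link_rate / agreed_rate) * step, rounded up (integer ceiling).
    weight = -(-link_rate * step // agreed_rate)
    # Assumed service share: one transmission every weight/step cell times.
    return link_rate * step / weight


print(achieved_rate(4, 3, 1))             # 2.0 (no STEP scaling)
print(round(achieved_rate(4, 3, 32), 3))  # 2.977
```

A larger STEP leaves less rounding error in the weight, which is why the delivered rate moves closer to the agreed 3Mbit/sec.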

Software implementations of scheduling algorithms run by a processor in a network
device are acceptable in non real-time applications. In such implementations, the size of
counters can be unlimited and the above scheduling method operates well. However, where
the network device must multiplex traffic from tens of thousands of links and make
scheduling selections in a fraction of a cell time, most software implementations are found
to lack the required processing speed. In order to satisfy processing speed requirements,
many schedulers are implemented in logic in ASICs (Application Specific Integrated
Circuits) or FPGAs (Field Programmable Gate Arrays). In order to meet the processing
speed requirements and limit complexity, such implementations need to use limited
numbers of logic elements, which in turn limits the size of counters that can be used.

The algorithm illustrated in Figure 3 is simple enough to be implemented in such a
limited number of logic elements whilst being flexible enough not to limit the size of
counters. In order to avoid unlimited sized counters, each is allowed to wrap around.
Assume, for example, that G is a counter of 18 bits. In this case a value that occupies half
of the counter (in this example 17 bits) is set as the counter's end point, U. When the
counter reaches or exceeds this value, it is reduced to a pre-set value, L. The maximum size
of a weight is determined by the usable size of the counter (17 bits) less the size of STEP.
Therefore if STEP is set as 32 (5 bits) then weights can be a maximum of 12 bits long. A
start value, L, for the counter G is selected, normally being 2xSTEP.
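
The sizing in this example can be worked through in Python. The function name `counter_limits` is hypothetical, and the sketch assumes that STEP is a power of two so that its size in bits can be taken as log2(STEP), as the "32 (5 bits)" figure above implies.

```python
def counter_limits(counter_bits: int, step: int):
    """Return (U, L, maximum weight width in bits) for a given counter width."""
    if step <= 0 or step & (step - 1):
        raise ValueError("this sketch assumes STEP is a power of two")
    usable_bits = counter_bits - 1     # half of the counter's range
    end_point = 1 << usable_bits       # U = 2^(n-1)
    start_value = 2 * step             # L, normally 2 x STEP
    step_bits = step.bit_length() - 1  # 32 -> 5 bits
    return end_point, start_value, usable_bits - step_bits


print(counter_limits(18, 32))  # (131072, 64, 12)
```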

At step 100, counters C1 to C_N are ordered in ascending order of W_i. Counter C* is
placed last in the order. At step 110, the counter C_i ordered first (lowest W_i) is compared
with G. If the counter C_i is less than or equal to G, the queue corresponding to the counter
is selected to transmit a cell in step 120. If the queue did not have a cell to transmit then the
next counter is selected. If the counter is not less than or equal to G, the next counter is
checked until one is found that is less than or equal to G. In step 130, the magnitude of the
counter C_i in relation to G is checked. If G minus the counter, C_i, is greater than L, the
counter C_i is set to G minus L in step 140. This prevents a counter for a queue that has
been inactive for some time dropping behind other counters due to the inactivity and then
consuming all available bandwidth until the counter catches back up, causing an
uncontrolled burst of traffic. By increasing the value of the counter in relation to
G, the queue associated with the counter C_i is allowed to jump towards the front of the
scheduling queue once after a period of inactivity and is then scheduled along with other
queues. In step 150, the counter C_i is incremented by its weight W_i. In step 160, G is
incremented by STEP. Returning to step 110, if no counter is less than or equal to G, the
algorithm jumps directly to step 160.
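
The check made in steps 130 and 140 can be sketched as a small Python function. The name `catch_up` and the numeric values in the usage lines are hypothetical and chosen only to illustrate the rule that a counter may lag G by at most L.

```python
def catch_up(counter: int, g: int, start_value: int) -> int:
    """Steps 130-140: keep a lagging counter within L of the global counter G."""
    if g - counter > start_value:
        # The queue has been idle; move its counter up to G - L so that it
        # cannot monopolise the link while the counter catches up.
        return g - start_value
    return counter


print(catch_up(0, 1000, 64))    # 936: an idle queue restarts just behind G
print(catch_up(950, 1000, 64))  # 950: an active queue is left unchanged
```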

In step 170, G is compared to U. If G is greater than or equal to the maximum set
value for the counter it is reset to L in step 180 and all counters C1 to C* are decremented by
U minus L or zeroed, whichever is higher. The algorithm then continues at step 100.
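
The wrap-around of steps 170 and 180 can be sketched as follows. The function name `wrap_counters`, the dictionary keys and the counter values in the usage line are illustrative assumptions; the decrement of U minus L follows the description of step 180 above.

```python
def wrap_counters(g, counters, end_point, start_value):
    """Steps 170-180: reset G to L and pull every queue counter down with it."""
    if g < end_point:
        return g, dict(counters)  # G has not reached U; nothing changes
    shift = end_point - start_value  # U - L
    # Each counter is decremented by U - L, but never below zero.
    wrapped = {name: max(0, c - shift) for name, c in counters.items()}
    return start_value, wrapped


g, counters = wrap_counters(256, {"C1": 320, "C2": 256, "C*": 64}, 256, 64)
print(g, counters)  # 64 {'C1': 128, 'C2': 64, 'C*': 0}
```

Because every counter is shifted by the same amount, the relative spacing between G and the counters of active queues is preserved across the wrap.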

Table 2 repeats the above example where the algorithm of Figure 3 is applied, L is
set at 64 and U is set at 256.

Transmission     1    2    3    4    5    6    7    8    9   10
G                0   32   64   96  128  160  192  224  256   64
C1               0✓  64   64✓ 128  128✓ 192  192✓ 256  256✓ 128
C2               0    0✓ 128  128  128  128✓ 256  256  256   64✓
C*               0    0    0    0✓  32   32   32   32✓  64    0

Table 2

In this example, at transmission 9, G reaches the limit set by U, so G is reset to L (64)
and C1 to C* are decremented by U-L (192).

In order to configure the traffic controller when a new stream is added without
affecting the existing streams, a double memory is used. A copy of the current
controller configuration is made and updated to incorporate the new stream. The controller
is then directed to swap from the current configuration to the copy. The two memories then
reverse roles, the copy becoming the current configuration and [...]. The data
structure used for one of the memories is illustrated in Figure 4.

A table 200 holds data on the physical ports of the network [...]
row for each cell queued for transmission and includes a column 201 indicating whether the
port is active, a column 202 indicating the port number and a column [...]
in an order table 210 for the port number.

The order table 210 includes a row for each transmission [...]
has a column 211 indicating whether the row corresponds to the last [...]
of the particular port number, and a column 212 indicating a row in a [...]

The present invention has been described with a certain degree of particularity, but
various alterations and modifications may be carried out without departing from the spirit
and scope of the following claims: